Loop nest optimization

Results: 25



1. Computing / Computer memory / Computer hardware / Computer architecture / Cell / Direct memory access / Emotion Engine / Sparse matrix / Scratchpad memory / Loop nest optimization / CPU cache / SIMD

Scientific Computing Kernels on the Cell Processor. Samuel Williams, John Shalf, Leonid Oliker, Shoaib Kamil, Parry Husbands, Katherine Yelick. Computational Research Division, Lawrence Berkeley National Laboratory, Berkeley,

Source URL: crd.lbl.gov

Language: English - Date: 2012-09-07 00:14:17
2. Compiler optimizations / Computing / Software engineering / Software / Loop nest optimization / Stencil code / Roofline model / Stencil / Program optimization / Common subexpression elimination / CPU cache / Scalable locality

Auto-tuning the 27-point Stencil for Multicore. Kaushik Datta², Samuel Williams¹, Vasily Volkov², Jonathan Carter¹, Leonid Oliker¹, John Shalf¹, and Katherine Yelick¹. ¹ CRD/NERSC, Lawrence Berkeley National Laborat

Source URL: iwapt.org

Language: English - Date: 2009-08-03 20:59:23
3. Computing / Software engineering / Compiler optimizations / Computer programming / Loop optimization / Automatic parallelization / Loop nest optimization / CPU cache / Program optimization / Software pipelining / Granularity / Lookup table

Performance Portable Optimizations for Loops Containing Communication Operations. Costin Iancu, Wei Chen, Katherine Yelick

Source URL: crd.lbl.gov

Language: English - Date: 2012-10-24 14:34:01
4. Compiler optimizations / Loop optimization / Automatic parallelization / Loop nest optimization / CPU cache / Program optimization / Software pipelining / Lookup table / Distributed computing / Fortran / Compiler / Algorithm

Performance Portable Optimizations for Loops Containing Communication Operations. Costin Iancu, Wei Chen, Katherine Yelick

Source URL: crd.lbl.gov

Language: English - Date: 2012-10-24 14:33:15
5. Computer memory / Cache / Computer architecture / Compiler optimizations / CPU cache / Central processing unit / Opteron / Cell / Sparse matrix-vector multiplication / Loop nest optimization / Multi-core processor / Advanced Micro Devices

Optimization of Sparse Matrix-Vector Multiplication on Emerging Multicore Platforms. Samuel Williams∗†, Leonid Oliker∗, Richard Vuduc§, John Shalf∗, Katherine Yelick∗†, James Demmel†. ∗ CRD/NERSC, Lawrenc

Source URL: crd.lbl.gov

Language: English - Date: 2012-09-07 00:12:17
6. Compiler optimizations / Numerical linear algebra / Cache / Computer memory / Locality of reference / Software optimization / Matrix multiplication / Loop optimization / Matrix / Array data type / Loop nest optimization / Loop tiling

CS:APP2e Web Aside MEM:BLOCKING: Using Blocking to Increase Temporal Locality. Randal E. Bryant, David R. O’Hallaron. June 5, 2012

Source URL: csapp.cs.cmu.edu

Language: English - Date: 2012-06-05 05:40:07
7. Parallel computing / Numerical linear algebra / Software optimization / Roofline model / Software testing / Matrix multiplication / Multi-core processor / FLOPS / Loop nest optimization / Matrix multiplication algorithm

Design of Parallel and High Performance Computing, HS 2013. Markus Püschel, Torsten Hoefler. Department of Computer Science, ETH Zurich

Source URL: spcl.inf.ethz.ch

Language: English - Date: 2013-11-29 06:51:17
8. Computer architecture / Cache / Central processing unit / Microprocessors / Computer memory / CPU cache / Stencil code / Loop nest optimization / Opteron / POWER5 / Cell / Multi-core processor

Optimization and Performance Modeling of Stencil Computations on Modern Microprocessors‡. Kaushik Datta†, Shoaib Kamil∗†, Samuel Williams∗†, Leonid Oliker∗, John Shalf∗, Katherine Yelick∗†. Abstract. St

Source URL: crd.lbl.gov

Language: English - Date: 2012-09-06 23:58:43
9. Compiler optimizations / Loop nest optimization / Stencil code / Roofline model / Program optimization / Stencil / CPU cache / Common subexpression elimination / Scalable locality

Auto-tuning the 27-point Stencil for Multicore. Kaushik Datta², Samuel Williams¹, Vasily Volkov², Jonathan Carter¹, Leonid Oliker¹, John Shalf¹, and Katherine Yelick¹. ¹ CRD/NERSC, Lawrence Berkeley National Laborat

Source URL: crd.lbl.gov

Language: English - Date: 2012-09-06 23:44:43
10. Cache / Computer memory / Computer architecture / Central processing unit / Virtual memory / CPU cache / Locality of reference / Loop interchange / Memory hierarchy / Loop nest optimization / Thrashing / Page table

Predicting Memory-Access Cost Based on Data-Access Patterns. Surendra Byna, Xian-He Sun. Illinois Institute of Technology. {renbyna, sun}@iit.edu. Abstract

Source URL: wgropp.cs.illinois.edu

Language: English - Date: 2016-08-16 11:52:12